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Abstract 



Several techniques have appeared in the literature to solve the equations 
of time-dependent density functional theory. We compare the efficiency of dif- 
ferent methods based on mesh representations of the wave functions (direct 
and Fourier space), taking as a test case the calculation of the surface plas- 
mon in the cluster Nag. For smaller systems, the methods have comparable 
efficiency. For large systems the direct time method has a decided advantage 
in computer storage requirements. It is also more economical on arithmetic 
operations, but is not as suited for parallel computing as the methods based 
on a frequency representation. 
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I. INTRODUCTION 

The time-dependent local density approximation has proven to be a useful tool to cal- 
culate the optical properties of finite systems such as atoms, molecules, and atomic clusters 
Jl|-^[|. The basic equation to be solved is conceptually very simple, little more than the 
time-dependent Schrodinger equation for a particle in a time-varying external field. Many 
numerical methods are in use to solve the equations. On the one side there are quantum 
chemistry methods based on atomic orbital representation for the wave function, and on 
another side there are methods based on mesh representations. We only consider the latter 
here, but even in this category there are a number of published techniques. Most funda- 
mentally, the time evolution can be calculated directly or in Fourier space, i.e. in terms 
of frequencies. The former method is practically a necessity for dealing with very strong 
external fields |l~0|JTT| and has been applied by two of us (K.Y. and G.B.) for the weak-field 



response as well Q . We shall call this approach the "nuclear physics" (NP) method, since 



the algorithms were originally developed in that field for describing nuclear reactions |L2 
The other methods we will consider |3||| solve equations in frequency space. The method 
described in ref. @ had its origins in condensed matter theory and uses Fourier represen- 
tation for both space and time; we shall call this the "condensed matter" (CMP) method. 
We also comment on ref. || which uses Fourier space for the time but a real space mesh for 



the spatial dependence [13]. Here the problem is cast into a matrix diagonalization in the 
particle-hole representation; we shall call it the diagonalization method. 

In this work we will compare the CMP code and the NP code for a specific system and 
present arguments for the scaling properties of the respective algorithms for larger systems. 
The system we choose to study is the atomic cluster Na 8 , and in particular the surface 
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plasmon excitation which is seen as a strong peak at 2.5 eV excitation. The TDLDA is not 
an exact theory and it predicts a excitation energy at about 2.7 eV. We shall demand of 
both methods that they achieve within 0.1 eV of the converged value. It makes little sense 
to calculate to higher precision in view of the intrinsic limitations of the theory. 

We shall now describe the various methods from a computational point of view. We shall 
use the symbol N with subscripts for quantities that scale roughly as the size of the physical 
system under study, and M for quantities that may be large but are independent of the size 
of the system. Important quantities common to the two codes are the number of electrons N e 
and the number of mesh points, No and Nr for real space and reciprocal space, respectively. 
Additional quantities that play a role are the number of frequencies to be calculated M w , and 
the number of time steps to evolute the wave function in the real-time method, My. Also, in 
methods that rely on sparse matrix multiplication, we need the number of nonzero entries in 
a row of the Hamiltonian, Mh, and in iterative methods to solve large matrix equations we 
need the number of iterations for convergence, M it . Finally, the response function method 
usually requires a sum over unoccupied states, N c . This notation is summarized in Table I. 

We will use same energy functional for all methods, so the choice of specific functional is 
not an issue in comparing the methods. As is commonly done, we calculate only the dynamics 
of the valence electrons. The core electrons are frozen and their presence is treated by using 
a pseudopotential to describe the ionic potential. We use the pseudopotential construction 
of Troullier and Martins [15], taking the nonlocal part by the method of Kleinman and 



Bylander [16| and including partial core corrections for the exchange-correlation energy 
|T7| . In this method, the local pseudopotential is fixed to the value in a particular angular 
momentum channel, and a nonlocal correction is made for other channels. Here we use 
the I = 1 potential as the local potential, and apply the nonlocal correction to the I = 
and I = 2 channels. The electron-electron interaction is taken in the simple local-density 



approximation (LDA) given by Perdew and Zunger |Tj|. More complicated functionals have 
better predictive power for ground state properties but give only small improvement 
to the optical response of neutral molecules. The proper description of the 1/r asymptotic 



behavior of the potential is going to be very important to describe charged systems, however 
for the aim of the present work this LDA deficiency is not relevant. 

The geometry of the Nas cluster was computed in ref. ||, and the lowest energy struc- 
ture found to be the bicapped octahedron (D 2 d symmetry). We use this structure in our 
comparison here. It has an average Na-Na bond length of 3.38 A and a slight deviation from 
the spherical symmetry. This leads to a polarizability tensor with two different components 
and two close-lying peaks are obtained in the photoabsorption cross section. 

II. THEORETICAL METHODS 

Before describing in detail each of the two methods for representing the wave functions 
(direct and Fourier space), we need to comment on the choice of the spatial cell size and 
mesh size as well as the time/frequency parameters (all are summarized in Table I). 

Since the wave functions are sensitive to boundaries, the calculations must be made in 
a volume several Angstroms larger than the size of the molecule or cluster. Using both 
methods, we determined how large a volume is needed to achieved 0.1 eV accuracy on the 
various excitation energies of interest in the system. We found that this is achieved in a 
spherical volume of radius R = 8 A using the NP code, and in a simple cubic supercell of 
side 12.7 A using the CMP code. These have nearly the same volume, and thus the same 
average distance from the cluster to the boundary. We have checked the convergence of 
the results by increasing the volume to a sphere of 12 A radius. The value of the plasma 
frequency is reduced by a maximum of 0.1 eV, that is, within the required accuracy. 

We have used an uniform spatial grid with Ax = 0.5 A spacing. This corresponds to 
a plane- wave cutoff energy of 6 Hartrees in the Fourier space method (see below). Within 
this parameters, a stable time-step to perform the time-evolution in the NP method is 
At = 0.003h/eV << h(Ax) 2 /m. The required O.leV accuracy in energy is obtained for total 
simulation times of 10 h/eV. Similarly in the Fourier space method we have taken a uniform 
frequency grid of M u = 100 between and 5 eV. Note that if the response is required for 
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larger frequencies we need to increase the number of points. The whole response is obtained 
at once in the time evolution method (unless up to energies of the order of (At) -1 ). This is 
a great advantage when the whole response is needed. 

A. NP method 

This method uses a direct solution of the time- dependent single-electron Schrodinger 
equation, 

if dU^t) =gjfS (t)0 i ( r ,t) (i = l...occ.) (1) 
where H^s is the Kohn-Sham Hamiltonian operator 

H K s(t) = -t^V 2 + VWr) + e 2 / dV^4 + V **( T > *) ( 2 ) 
2m J |r — r'| 

and n is the time-dependent electron density n(r, t) = J2i=i <P*i ( r ) ^)<M r 5 I n the solution 
of this equation in the spatial and time variables following the algorithm of ref. there 
are two time-consuming operations. One is multiplying the single-electron Hamiltonian 
operator by the vector representing the wave function. The dimensionality of the vector is 
the number of mesh points Nr times the numbers of electron orbitals N e . The operator is a 
sparse matrix with Mh nonzero elements per row. Thus the basic operation requires about 
N s NrMh complex floating point operations. The time evolution operator in the NP code is 
implements by a power series expansion of the exponential operator exp(— iHAt) to fourth 
order. A predictor-corrector cycle requires two such operations. Thus the method requires 
8 Hamiltonian multiplications per time step. Thus for time steps the total number of 
floating point operations is given by 

NPFPO : WN e N R M H M T 

The sparseness of the Hamiltonian matrix in a real space formulation is determined by 
the finite difference formula for kinetic energy (nine-point formula in our case); and by the 
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nonlocal-projection parts of the potential. In total we have a number of non-zero elements 
of the each Hamiltonian row M H 100 for the grid parameters used for Na 8 . 

The other time-consuming part of the NP algorithm is solving the Poisson equation, 
which must be done twice at each time step. The NP code uses a multipole expansion 
combined with a relaxation method to deal with the higher multipoles. It is hard to estimate 
the scaling properties of this part, but in the present study this part of the computation takes 
1.5 times as many operations as the Hamiltonian multiplication operation. We shall assume 
the same factor for estimating the scaling properties of the algorithm. In principle, the 
Poisson equation can be solved by methods that are of order Nr or N^logNji, as multigrid 
or fast-Fourier transformation, so this part should not dominate for large system. 

Storage requirements are small: the vector wave function plus Vuartree and V ion local 
potentials in Hamiltonian, charge densities and some intermediate arrays. Vu artree requires 
a slightly larger volume because of the way the Poisson equation is solved. 

NP storage : N R (N e + 4.5) 

This NP method is ideal to be combined with molecular dynamics simulations for the 
ions because it uses only ground-state occupied information and would scale roughly linearly 
with the number of atoms in the system. There is not so much book-keeping as in the usual 
perturbative formalism (no need for storing the large set of unoccupied wave-functions and 
the large dielectric matrices). 



B. CMP method 

Here the basic object of the calculation is the linear response to an external field of some 
frequency uo. The linear response matrix \ is constructed in momentum space with the 
following matrix inversion 

X = (1 - XoKy'xo (3) 
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where the independent particle response xo and the interaction K are matrices defined as 
follows. The Xo has elements G, G' given by |T9| 



where Vt denotes the unit-cell volume, i, k label Kohn-Sham eigenfunctions and and fk 
are the corresponding eigenenergies and occupancy factors. The sum goes over N e occupied 
orbitals and N c empty orbitals. The interaction K is the Fourier transform of the electron- 
electron interaction in the Kohn-Sham equation, which is given in coordinate space by 



We now describe the computation starting with the Kohn-Sham wave functions and ener- 
gies in a momentum space representation. To evaluate the independent particle response Xo 
in eq.( ^), one first calculates the particle-hole matrix elements of the momentum operator 
and stores them in a table (or in disk). This computational effort is of the order of N s N c Nq 
operations, and the table size to be stored is N e N c Nc complex numbers. Then the evalu- 
ation of eq. (4) requires Nq matrix elements to be calculated, each requiring particle-hole 
summation, to give m 2NqN e N c operations for each frequency. If one were to make full space 
calculation, the number of empty orbitals summed in eq. (4) would be of the same order 
as the dimensionality of the space. However, the number of empty orbitals can be severely 
truncated without effecting the long-wavelength dipole response. In the example, we find 
N c = 320 is adequate, which is more than an order of magnitude smaller than the size of the 
space and corresponds to include unoccupied states up to 20 eV above the highest-occupied 
orbital. This is a reasonable approximation as we are interested only in getting the optical 
spectra for excitation energies below 10 eV. This approximation is an important saving in 
building up the response matrix. 

One also truncates the calculation of the response matrix in another way. We have also 
assumed that the off-diagonal elements of the response function are zero for G- vectors outside 
an sphere of 1.25 A (that is to consider « 3200 points in the G-space). This corresponds to 






(5) 
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reducing the number of matrix elements to be computed and stored to Ng(Nq + 4) /18. Note 
that the necessity to store the Nq matrix puts a higher demand on the computer memory 
than the NP method. The memory required to store the complex, double-precision 
numbers in the example problem is 164Mb. 

There are now three steps to evaluate eq. (3), two matrix multiplications and a matrix 
inversion. The matrices are not sparse, so the matrix multiplications each cost 2(Ng/3) 3 
arithmetical operations []. The matrix inversion is of the same order, requiring (JVg/3) 3 
operations. The total is ~ 5(Ng/3) 3 . These represent the most computationally demand- 
ing steps in the CMP method, given the truncation in the N c . The computed \ is next 
transformed to the coordinate space representation. Using the fast Fourier transform, this 
takes ~ N G logNc operations. The dynamical polarizability can be now computed from 

matrix times vector multiplication. From this one can easily extract 
the photoabsorption cross section <j(uj) = — 7ma(w). 

Then the total computational effort in the CM method is: 

CM FPO : MUN c N e (N G /3) 2 + 5(iV G /3) 3 ) 

with the last term dominant. The storage requirements for all the occupied and unoccupied 
wave functions plus the whole complex response matrix is 

CM storage : (N e + N C )N G + 2(N G + N G /9) + N c N e N G /3 

To achieve the targeted energy convergence with this algorithm, the momentum space 
mesh was chosen to correspond to a simple cubic supercell of L = 12.7 A on a side. This 
implies that the mesh spacing in momentum space is 5k = 2n / L = 0.137 A. The momentum 
space representation takes all the points within a sphere of radius k max = 1.83 A (that 



A small technical point should be mentioned, associated with the divergence of the Coulomb 
interaction at G = G'. This is dealt with |]19| by taking a numerical limit as |G — G'| — ► 0, and 
this adds about 10% to the number of operations for computing the matrix product. 
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corresponds to a plane-wave cutoff energy of 12 Ry). The size of the vector in the momentum 
representation is thus Nq = 47r(/c ma , r /A/c) 3 /3 ~ 10,000. Note that this is slightly smaller 
than the number required for the coordinate space representation, however we need to stress 
that a larger number of G-vectors are needed to describe the action of the potential on a 
wave-function (Vip corresponds to a convolution in Fourier space). Finally, an additional 
numerical parameter is the imaginary part of the frequency i], which we have taken as 
77 = 0.05 eV to produce a resolution of 0.1 eV in the spectral features. 

In the discussion below we have not include the computational requirements to perform 
the ground state calculations, occupied and unoccupied orbitals. This could be a major 
storage bottle-neck for very large systems as the calculation of a large set of unoccupied 
wave functions has a cubic scaling of the number of atoms in memory and computing time. 
In the present calculation this initialization process takes 10% of the total computational 
time. 



C. Other methods 

We mention here two other methods from a computational point of view. Since we 
have not carried out numerically computations on our test problem with these methods, the 
discussion will be brief. 



1. Modified Sternheimer method 
The modified Sternheimer method was first applied to the time-dependent Kohn-Sham 



equation for atomic excitations [20], and has since been applied to the dielectric response 



of crystals using the momentum space representation [[Z1J and to the finite system Cqq |22 
using the coordinate space representation. Here one solves an inhomogeneous equation for 
the perturbed wave functions 4>f using an iterative method. The perturbation is a sinusoidal 
potential field combining the external field V ext and the internal field from the time-varying 
electron density. The equations are 
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(e I -H° KS ±u+ir l )<^ = PV i 



(6) 



where 



Vi = (V ext + KSn)^ 



and 



6n = ReJ2<f>i(<f>t + <f>i)- 



(7) 




P is a projection operator removing occupied orbitals. In ref. [^], the two equations are 

constructed in coordinate space and solved with a double iteration. One makes a guess for 
the density 5n, and solves eq. (§) by the conjugate gradient method. 5n is refined from 



convergence. The numerical cost will thus depend largely on the cost of the Hamiltonian 
operation which is ~ MhNrNc in coordinate space, and the number of iterations Mu re- 
quired to get a converged solution. Remembering also that frequency space methods need 
Mb,, the number of frequencies to be examined, the computational cost of this method is 



The method can be used in this form for nonresonant frequencies, but near the eigenfrequen- 
cies the nearby singularities in eq. (6) must be removed for the conjugate gradient method 
to converge. Thus this method would be similar to methods utilizing the particle-hole rep- 
resentation in needing a considerable number of the wave functions and eigenenergies of 
unoccupied states. The singularities are removed by projecting on the unoccupied wave 
function subspace the right hand side of eq. (6), 



the resulting <j)f again with the conjugate gradient method, and the process is repeated to 



Modified Sternheimer (real space) : M^MuMnNuNg 



(8) 



v7 = ^-E<M<^)- 



The desired wave functions <j)f are obtained from the projected solutions by 




— ijJ — 17] 
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It is difficult to give an a priori estimate of Ma or its size-scaling properties (although 
with our notation we have assumed that it does not grow with N). Unfortunately, our 
implementation of eq. (7) still left the convergence somewhat erratic. Typically it takes of 
the order of ~ 1000 iterations of the double loop to get convergence. Thus it would 
require some improvement of the algorithm to make it attractive to apply to large systems. 

The momentum space implementation of the modified Sternheimer method is similar. 
This method also needs the conditioning step for convergence of the CG iteration. The main 
difference is in the Hamiltonian multiplication, which here requires m 2(iV r G'/3) 3 operations 
as discussed in Sect. IIB. Thus the total is 



Because the Hamiltonian operation is more costly in momentum space, this method is prob- 
ably not competitive to the others, unless it were the case that the convergence of the 
iteration were intrinsically much more reliable. 



The frequency-space methods discussed so far have relied in some way on operator inver- 
sion. It is also possible to cast the problem as one of matrix diagonalization. This method 
was applied to cluster excitations in the TDLDA by Vasiliev et al. ||. The authors start 
from a basis in coordinate space and construct Kohn-Sham orbitals for both occupied and 
empty states as is done in the CMP method, but representing the orbitals in coordinate space 
mesh, as in the NP method. The storage requirement for the orbitals is ~ (N c + N £ )Nr, 
which is larger than in the NP method but smaller than in the CMP method. 

The next step of the calculation is to construct the matrix to be diagonalized. The 
eigenvalue equation to be solved is 



Modified Sternheimer (momentum space) : 2M aJ Mj 4 (A r c/3) 3 



(9) 



2. Diagonalization method 



RF 



n 




n 



(10) 



where F n are the eigenvectors and R is a matrix. Its elements are 
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Ra,a' = (Ct - £j) 2 $a,a> + 2 \f i e i ~ e j)( e i ~ ^j)^a,a' (H) 

where the indices a = (ij),a = (i',f) label combinations of unoccupied orbitals i and 
occupied orbitals j. The interaction matrix elements K a ^ a i are simply the particle- hole 
matrix elements of the residual interaction, eq.(|5]). There is a substantial computational 
cost in construct the interaction matrix K. A straightforward transformation from the 
coordinate space to the particle-hole representation requires ~ N^N^N^ operations for the 
Coulomb interaction. However, this is reduced considerably by using an efficient method to 
solve the Poisson equation ||. For example, using the fast Fourier transform one may find 
the Coulomb field for a given particle-hole state taking only iV^ log Nr operations. Saving 
the Coulomb field in the coordinate representation, the matrix element to a given final state 
takes ~ Nr operations. The effort of solving the Poisson equation is thus distributed over 
the number of final states, and the operations to construct the full matrix has a leading 
dependence NrN^N^, the scaling appropriate for the local part of the interaction^. Once 
the matrix is constructed, the diagonalization requires ~ (N c N e ) 3 operations. However, 
taking the N values from Table |, the matrix diagonalization effort is small compared to 
that needed to construct the matrix. We have therefore taken that step to assign this 
method's size scaling in Table II. 



III. NUMERICAL RESULTS 

We will discuss in detail the physical quantities computed in the NP and CMP methods 
and refer to || for the results using the diagonalization method. We want to stress that the 
three approaches must give the same values if the numerical parameters are chosen with fine 
enough grids and large enough cutoffs to get converged results. 

With the parameter sets chosen for the two methods, the results are quite similar. In 



Table IV we show calculated Kohn-Sham energies and the surface plasmon energy. The 



2 However, in the implementation of ref. ||, the Poisson solver in fact is the most costly operation. 
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first entry e\ is the Kohn-Sham eigenvalue of the most bound orbital. The absolute energies 
have no significance in the supercell method, because the absolute Coulomb potential is 
undefined. Therefore, for this entry we give the value from the NP code and set the scale 
of the CMP energies at that value. The next three rows correspond to the other bound 
orbitals use the G = point of the Brillioun zone for the CMP values. We can see that the 
methods agree to within less than 0.1 eV. The next entry is the lowest unoccupied orbital. 
This is significantly different for the two methods. This orbital has sufficient extension to 
have its energy sensitive to the boundary, which of course is different for the two methods. 
We confirm the boundary sensitivity in the CMP code by calculating the energies at other 
points in the Brillouin zone. Differences are less than 0.1 eV for occupied orbitals, but reach 
0.2 eV for the lowest unoccupied orbital. This last point indicates the fact that the empty 
orbitals are more sensitive to the boundary conditions and in the periodic supercell they feel 
the potential from the other clusters. 

We have also presented in Table [TV] the results of the NP method Q. We have also checked 
the convergence of the plasmon frequency with respect to the cell size and found that this 
value is converged to less than O.OleV for a sphere of R=12 A. The fully converged value 
in the NP method is 2.65 eV. The difference with the experimental value of 2.53 eV can be 
attributed to deficiencies in the LDA approximation as well as for finite temperature effects 
in the experiments ||. 

In Table |V| we summarize the results for the static averaged electrical polarizability of 
Na§ obtained by the different methods. The agreement among the different approaches 
is very good and the remaining difference with experiments can be again assigned to core 



3 The plasmon frequency is sensitive to the core-exchange correction at the level of 0.1 eV. We 
have included that correction in Hks it improves the description of the structural properties of 
Na metal. We note that the result without core corrections (2.89 eV) it is very close to the jellium 
value (2.9 eV). 
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polarization, exchange-correlation and temperature effects. These effects tends to increase 
the polarizability bringing the computed values close to the experiments 

IV. CONCLUSIONS 

In the theory of electronic excitations of finite many-electron systems, the time- dependent 
Kohn-Sham equation with an adiabatic local density approximation for the interaction en- 
ergy function offers an attractive compromise towards the goals of accuracy and computa- 
tional practicality. But even within the TDLDA scheme there are several methods in use, 
and our purpose was to compare them on the same footing by applying them to the same 
physical problem, and demanding the same accuracy. The goal is to gain a general under- 
standing of the numerical resources (total numbers of arithmetic operations and computer 
memory) required by the different methods. One can then extrapolate to large systems and 
make a judgment on which methods offer the best prospects. 

We have only considered methods based on a grid representation of the electron wave 
functions, and have concentrated on two algorithms, the NP method in real time and real 
space, and the CMP method in Fourier transformed time and space. 

We chose to study the response of the Na 8 cluster around the surface plasmon excitation 
energy. The two methods turned out to have comparable requirement on arithmetic oper- 
ations. However, it should also be noted that the computational work increases with the 
range of frequencies that one studies in the CMP method, but not in the NP method. With 
latter, the entire response is obtained from a single calculation. 

In comparing the two methods to ascertain their scaling with the size of the system N, we 
have deliberately ignored the first task in either method, the construction of the eigenstates 
of the static Kohn-Sham operator. In the NP method only the occupied orbitals are needed, 
but in the CMP method one also needs a large number of unoccupied orbitals as well. Their 
calculation scales like in principle, but in practice this phase of the computation is short 
compared to the dynamic calculation and so we ignore it. Let us now compare the scalings 
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by taking the expressions in Table II, dropping the subscripts on the N quantities. The 
NP method thus scales as N 2 . This behavior was also found studying the excitations of 
long carbon molecules ||. The CMP method has a poorer scaling behavior, namely N 3 . 
We also considered two other methods without however examining them in as much detail. 
In principle, the modified Sternheimer method in coordinate space can achieve N 2 scaling 
without the cost of the large Mt factor of the real-time method. However, we did not find 
a reliably converging iteration procedure to solve the basic inhomogeneous linear equation 
set. The final method we discussed, the diagonalization method using real space and Fourier 
time, seems to have a poorer iV-scaling than the others, but may be advantageous in some 
circumstances (see below). 

Besides arithmetic operations, storage can play a role in the practicality of the different 
algorithms for large systems. Here we find that the storage requirements are grossly different 
for the NP and CMP methods, favoring the NP approach. From Table II, it has a N 2 scaling 
while the CMP method has an iV 3 behavior. This is already significant in the Na§ system 
we studied, as may be seen from Table III. 

Thus our results favor the real-time and real-space methods, offering economy in both 
storage and arithmetic operations. However, there are a number of caveats. We have not 
considered the suitability of the different algorithms for parallel computing. In a parallel 
computing environment, the frequency-space methods gain favor because the factor can 
be trivially absorbed in the parallel processing. In addition, the diagonalization method can 
benefit from the parallel computation of different rows of the matrix. Also the sparseness 
of the Hamiltonian matrix is important for the real space method; this would be lost if for 
example the energy functional used the full Fock exchange interaction. 

Finally, we mention two nonnumerical benefits of the real-time method: as was said 
earlier, it is nonperturbative and therefore allows effects of large fields to be calculated with 
the same effort. And it uses the same energy functional (permitting the program to call 
the same subroutine) for the dynamic calculation as for the static calculation to prepare the 
ground state. 
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TABLES 

TABLE I. Symbol definitions for quantities pertaining to the computational effort required by 
the various algorithms discussed in the main text, and their values. 



Symbol 


Meaning 


NP method 


CMP method 


iv± l 


tllTlP stPDS 


10 4 




M 
ivi tjj 


Tininhpr of TrpniiPTiPiPS 




10 


M H 


nonzero elements in H matrix row 


100 




M lt 


iterations in conjugate gradient method 






N R 


real-space points 


17, 000 




N G 


reciprocal-space points 




9,771 


N e 


number of electron orbitals (occupied states) 


4 


4 


N c 


unoccupied states 




320 



TABLE II. Leading-order for the size scaling of various algorithms for TDLDA-general com- 
parison: floating point operations (FPO) and memory requirements. 



Method 


FPO 


Memory 


NP 


N e N R M H M T 


N R {N e + A.h) 


CMP 


5MUN G /3) 3 


iV3/9 


Modified Sternheimer 


M u Mn Mh N e Nr 


N R (N e + N c ) 


Diagonalization 


N 2 c N 2 e N R 


(N c N e ) 2 



TABLE III. Comparison of computational difficulty of NP and CMP methods for N&g 



Resource 


NP 


CMP 


Memory (MBy) 


7 


350 


Floating point operations 


1.5 x 10 12 


1.7 x 10 12 
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TABLE IV. Orbital energies ej and surface plasmon energy lom in Nag. For comparison in 
parenthesis we show the result of a calculation within the NP method without including partial 
core corrections in the pseudopotential generation and time evolution. 



hiTlPT'P'v 


NP 


CMP 


Exn feV") 

1 JA U . \ C V / 


£l 


_i ±.uo 


-*±.uo 




£2 


-3.41 


-3.35 




£3 


-3.00 


-2.97 




£4 


3.00 


-2.97 




e 5 (LUMO) 


-1.88 


-2.01 






2.77 


2.6 


2.53 [23| 



TABLE V. Static polarizability of Na 8 (A 3 ) 



Exp. 


NP 


Atomic 


CMP 


All-electron 


128.7 


24 


1 


103 


117 | 


119 


114.9 1 25] 
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